Regularization and Averaging of the Selective Naive Bayes classifier

Author

  • Marc Boullé
Abstract

The naïve Bayes classifier has proved to be very effective on many real data applications. Its performance usually benefits from an accurate estimation of univariate conditional probabilities and from variable selection. However, although variable selection is a desirable feature, it is prone to overfitting. In this paper, we introduce a new regularization technique to select the most probable subset of variables and propose a new model averaging method. The weighting scheme on the models reduces to a weighting scheme on the variables, and finally results in a naïve Bayes classifier with "soft variable selection". Extensive experimental results show that the averaged regularized classifier outperforms the initial Selective Naïve Bayes classifier.
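As an illustrative sketch in standard notation (the notation below is assumed here, not drawn from the paper), a classifier with soft variable selection can be written, for class c, variables x_1, ..., x_K, and per-variable weights w_k obtained from the averaging step, as

P(c \mid x_1, \dots, x_K) \;\propto\; P(c) \prod_{k=1}^{K} P(x_k \mid c)^{\,w_k}, \qquad w_k \in [0, 1],

where w_k = 0 discards variable k, w_k = 1 keeps it fully, and intermediate values correspond to the soft selection induced by the model averaging.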


Related articles

Compression-Based Averaging of Selective Naive Bayes Classifiers

The naive Bayes classifier has proved to be very effective on many real data applications. Its performance usually benefits from an accurate estimation of univariate conditional probabilities and from variable selection. However, although variable selection is a desirable feature, it is prone to overfitting. In this paper, we introduce a Bayesian regularization technique to select the most prob...


A New Approach for Text Documents Classification with Invasive Weed Optimization and Naive Bayes Classifier

With the rapid increase in the number of documents, the use of Text Document Classification (TDC) methods has become a crucial matter. This paper presents a hybrid model of Invasive Weed Optimization (IWO) and the Naive Bayes (NB) classifier (IWO-NB) for Feature Selection (FS), in order to reduce the large size of the feature space in TDC. TDC includes different steps such as text processing, feature extraction, form...


L1/Lp Regularization of Differences

In this paper, we introduce L1/Lp regularization of differences as a new regularization approach that can directly regularize models such as the naive Bayes classifier and (autoregressive) hidden Markov models. An algorithm is developed that selects values of the regularization parameter based on a derived stability condition. For the regularized naive Bayes classifier, we show that the method ...


Bayesian Model Averaging for Improving Performance of the Naïve Bayes Classifier

Feature selection has proved to be an effective way to reduce model complexity while giving relatively desirable accuracy, especially when data are scarce or the acquisition of some features is expensive. However, the single selected model may not always generalize well to unseen test data, whereas other models may perform better. Bayesian Model Averaging (BMA) is a widely used approach to...
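As a hedged illustration of the generic BMA scheme (standard notation, not taken from the cited paper), the averaged prediction replaces a single selected model by a posterior-weighted mixture over candidate models M given the data D:

P(c \mid x, D) \;=\; \sum_{M} P(c \mid x, M)\, P(M \mid D),

so that models with higher posterior probability contribute more to the final prediction.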


The Indifferent Naive Bayes Classifier

The Naive Bayes classifier is a simple and accurate classifier. This paper shows that, assuming the Naive Bayes classifier model and applying Bayesian model averaging and the principle of indifference, an equally simple, more accurate and theoretically well-founded classifier can be obtained. In this paper we use Bayesian model averaging and the principle of indifference to derive a...



Publication year: 2006